Gradient based motion estimation

ABSTRACT

A technique for generating motion vectors for applications requiring field or frame rate interpolation, especially standards conversion. The image gradients are calculated on the same standard as the input video or film signal, and vertical/temporal interpolators are then used to convert them to the output standard before the motion vectors are determined. This allows motion vectors to be easily calculated on the output standard.

The invention relates to a technique for estimating motion vectors for a video sequence and, in particular, to a motion estimator for use with a standards converter. The technique has application in any apparatus requiring field or frame rate interpolation, for example, slow motion display apparatus and conversion of film sequences to interlaced video sequences.

Gradient motion estimation is one of three or four fundamental motion estimation techniques and is well known in the literature (references 1 to 18). More correctly called ‘constraint equation based motion estimation’, it is based on a partial differential equation which relates the spatial and temporal image gradients to motion.

Gradient motion estimation is based on the constraint equation relating the image gradients to motion. The constraint equation is a direct consequence of motion in an image. Given an object, ‘object(x, y)’, which moves with a velocity (u, v), the resulting moving image, I(x, y, t), is defined by Equation 1;

I(x,y,t)=object (x−ut, y−vt)

This leads directly to the constraint equation, Equation 2;

$u \cdot \frac{\partial I(x,y,t)}{\partial x} + v \cdot \frac{\partial I(x,y,t)}{\partial y} + \frac{\partial I(x,y,t)}{\partial t} = \frac{\partial\,\mathrm{object}(x,y)}{\partial t} = 0$

where, provided the moving object does not change with time (perhaps due to changing lighting or distortion), ∂object/∂t = 0. This equation is, perhaps, more easily understood by considering an example. Assume that vertical motion is zero, the horizontal gradient is +2 grey levels per pixel and the temporal gradient is −10 grey levels per field. Then the constraint equation says that the ratio of horizontal and temporal gradients implies a motion of 5 pixels/field (u = −(−10)/2 = 5). The relationship between spatial and temporal gradients is summarised by the constraint equation.
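By way of illustration only, the worked example above can be reproduced with a few lines of Python. The function name `horizontal_speed` is hypothetical and the sketch assumes, as in the example, that vertical motion is zero so that equation 2 reduces to u·∂I/∂x + ∂I/∂t = 0.

```python
def horizontal_speed(grad_x: float, grad_t: float) -> float:
    """Solve u * grad_x + grad_t = 0 for u (vertical motion assumed zero).

    grad_x is in grey levels per pixel, grad_t in grey levels per field,
    so the result is in pixels per field.
    """
    if grad_x == 0.0:
        # With no horizontal detail the constraint says nothing about u.
        raise ValueError("zero horizontal gradient: speed cannot be measured")
    return -grad_t / grad_x


print(horizontal_speed(2.0, -10.0))  # 5.0
```

The guard on a zero gradient reflects the fact that a flat region carries no motion information.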

To use the constraint equation for motion estimation it is first necessary to estimate the image gradients; the spatial and temporal gradients of brightness. In principle these are easily calculated by applying straightforward linear horizontal, vertical and temporal filters to the image sequence. In practice, in the absence of additional processing, this can only really be done for the horizontal gradient. For the vertical gradient, calculation of the brightness gradient is confused by interlace, which is typically used for television pictures; pseudo-interlaced signals from film do not suffer from this problem. Interlaced signals only contain alternate picture lines on each field. Effectively this is vertical sub-sampling, resulting in vertical aliasing which confuses the vertical gradient estimate. Temporally the situation is even worse: if an object has moved by more than 1 pixel in consecutive fields, pixels in the same spatial location may be totally unrelated. This would render any gradient estimate meaningless. This is why gradient motion estimation cannot, in general, measure velocities greater than 1 pixel per field period (reference 8).

Prefiltering can be applied to the image sequence to avoid the problem of direct measurement of the image gradients. If spatial low pass filtering is applied to the sequence then the effective size of ‘pixels’ is increased. The brightness gradients at a particular spatial location are then related for a wider range of motion speeds. Hence spatial low pass filtering allows higher velocities to be measured, the highest measurable velocity being determined by the degree of filtering applied. Vertical low pass filtering also alleviates the problem of vertical aliasing caused by interlace. Alias components in the image tend to be more prevalent at higher frequencies. Hence, on average, low pass filtering disproportionately removes alias rather than true signal components. The more vertical filtering that is applied, the less is the effect of aliasing. There are, however, some signals in which aliasing extends down to zero frequency. Filtering cannot remove all the aliasing from these signals, which will therefore result in erroneous vertical gradient estimates and, therefore, incorrect estimates of the motion vector.

Prefiltering an image sequence results in blurring. Hence small details in the image become lost. This has two consequences: firstly, the velocity estimate becomes less accurate since there is less detail in the picture and, secondly, small objects cannot be seen in the prefiltered signal. To improve vector accuracy hierarchical techniques are sometimes used. This involves first calculating an initial, low accuracy, motion vector using heavy prefiltering, then refining this estimate to higher accuracy using less prefiltering. This does, indeed, improve vector accuracy but it does not overcome the other disadvantage of prefiltering, that is, that small objects cannot be seen in the prefiltered signal, hence their velocity cannot be measured. No amount of subsequent vector refinement, using hierarchical techniques, will recover the motion of small objects if they are not measured in the first stage. Prefiltering is only advisable in gradient motion estimation when it is only intended to provide low accuracy motion vectors of large objects.

Once the image gradients have been estimated the constraint equation is used to calculate the corresponding motion vector. Each pixel in the image gives rise to a separate linear equation relating the horizontal and vertical components of the motion vector and the image gradients. The image gradients for a single pixel do not provide enough information to determine the motion vector for that pixel. The gradients for at least two pixels are required. In order to minimise errors in estimating the motion vector it is better to use more than two pixels and find the vector which best fits the data from multiple pixels. Consider taking gradients from 3 pixels. Each pixel restricts the motion vector to a line in velocity space. With two pixels a single, unique, motion vector is determined by the intersection of the 2 lines. With 3 pixels there are 3 lines and, possibly, no unique solution. This is illustrated in FIG. 1. The vectors E₁ to E₃ are the errors from the best fitting vector to the constraint line for each pixel.

One way to calculate the best fit motion vector for a group of neighbouring pixels is to use a least mean square method, that is, minimising the sum of the squares of the lengths of the error vectors (E₁ to E₃ in FIG. 1). The least mean square solution for a group of neighbouring pixels is given by the solution of Equation 3;

$\begin{bmatrix}\sigma_{xx}^{2} & \sigma_{xy}^{2} \\ \sigma_{xy}^{2} & \sigma_{yy}^{2}\end{bmatrix} \cdot \begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = -\begin{bmatrix}\sigma_{xt}^{2} \\ \sigma_{yt}^{2}\end{bmatrix}$

where

$\sigma_{xx}^{2} = \sum \frac{\partial I}{\partial x} \cdot \frac{\partial I}{\partial x}, \quad \sigma_{xy}^{2} = \sum \frac{\partial I}{\partial x} \cdot \frac{\partial I}{\partial y}$

etc., (u₀, v₀) is the best fit motion vector and the summations are over a suitable region. The (direct) solution of equation 3 is given by Equation 4;

$\begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = \frac{1}{\sigma_{xx}^{2}\sigma_{yy}^{2} - \sigma_{xy}^{4}} \begin{bmatrix}\sigma_{xy}^{2}\sigma_{yt}^{2} - \sigma_{yy}^{2}\sigma_{xt}^{2} \\ \sigma_{xy}^{2}\sigma_{xt}^{2} - \sigma_{xx}^{2}\sigma_{yt}^{2}\end{bmatrix}$

Small regions produce detailed vector fields of low accuracy and vice versa for large regions. There is little point in choosing a region which is smaller than the size of the prefilter since the pixels within such a small region are not independent.
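The direct least mean square solution may be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the apparatus itself: the name `least_squares_vector` and the representation of a region as a list of per-pixel (∂I/∂x, ∂I/∂y, ∂I/∂t) triples are hypothetical, and the sums are taken over the whole list rather than by a running filter.

```python
from typing import Iterable, Tuple

Gradient = Tuple[float, float, float]  # (dI/dx, dI/dy, dI/dt) for one pixel


def least_squares_vector(gradients: Iterable[Gradient]) -> Tuple[float, float]:
    """Best fit (u0, v0) for a region using the direct solution (equation 4)."""
    sxx = sxy = syy = sxt = syt = 0.0
    for ix, iy, it in gradients:
        # Accumulate the sums of gradient products defined under equation 3.
        sxx += ix * ix
        sxy += ix * iy
        syy += iy * iy
        sxt += ix * it
        syt += iy * it
    det = sxx * syy - sxy * sxy
    if det == 0.0:
        # Plain area or a single edge: no unique vector exists.
        raise ValueError("region holds too little detail for a unique vector")
    u0 = (sxy * syt - syy * sxt) / det
    v0 = (sxy * sxt - sxx * syt) / det
    return u0, v0


# Three pixels whose gradients are consistent with a motion of (2, -1).
region = [(1.0, 0.0, -2.0), (0.0, 1.0, 1.0), (1.0, 1.0, -1.0)]
u0, v0 = least_squares_vector(region)
```

For the three pixels shown, each triple satisfies u·∂I/∂x + v·∂I/∂y + ∂I/∂t = 0 with (u, v) = (2, −1), so the best fit coincides with that vector.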

Typically, motion estimators generate motion vectors on the same standard as the input image sequence. For motion compensated standards converters, or other systems performing motion compensated temporal interpolation, it is desirable to generate motion vectors on the output image sequence standard. For example, when converting between European and American television standards the input image sequence is 625 line 50 Hz (interlaced) and the output standard is 525 line 60 Hz (interlaced). A motion compensated standards converter operating on a European input is required to produce motion vectors on the American output television standard.

It is an object of the present invention to provide a method and apparatus capable of generating motion vectors on an output standard different from the input standard. This is achieved by first calculating image gradients on the input standard and then converting these gradients to the output standard before implementing the rest of the motion estimation process.

The direct implementation of gradient motion estimation, discussed herein in relation to FIGS. 2 and 3, can give wildly erroneous results. Such behaviour is extremely undesirable. These problems occur when there is insufficient information in a region of an image to make an accurate velocity estimate. This would typically arise when the analysis region contained no detail at all or only the edge of an object. In such circumstances it is either not possible to measure velocity or only possible to measure velocity normal to the edge. It is attempting to estimate the complete motion vector, when insufficient information is available, which causes problems. Numerically the problem is caused by the 2 terms in the denominator of equation 4 becoming very similar, resulting in a numerically unstable solution for equation 3.

A solution to this problem of gradient motion estimation has been suggested by Martinez (references 11 and 12). The matrix in equation 3 (henceforth denoted ‘M’) may be analysed in terms of its eigenvectors and eigenvalues. There are 2 eigenvectors, one of which points parallel to the predominant edge in the analysis region and the other points normal to that edge. Each eigenvector has an associated eigenvalue which indicates how sharp the image is in the direction of the eigenvector. The eigenvectors and values are defined by Equation 5;

$M \cdot e_{i} = \lambda_{i} e_{i}, \quad i \in \{1, 2\}$

where

$M = \begin{bmatrix}\sigma_{xx}^{2} & \sigma_{xy}^{2} \\ \sigma_{xy}^{2} & \sigma_{yy}^{2}\end{bmatrix}$

The eigenvectors eᵢ are conventionally defined as having length 1, which convention is adhered to herein.

In plain areas of the image the eigenvectors have essentially random direction (there are no edges) and both eigenvalues are very small (there is no detail). In these circumstances the only sensible vector to assume is zero. In parts of the image which contain only an edge feature the eigenvectors point normal to the edge and parallel to the edge. The eigenvalue corresponding to the normal eigenvector is (relatively) large and the other eigenvalue small. In this circumstance only the motion vector normal to the edge can be measured. In other circumstances, in detailed parts of the image where more information is available, the motion vector may be calculated using Equation 4.

The motion vector may be found, taking into account Martinez' ideas above, by using Equation 6;

$\begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = -\left( \frac{\lambda_{1}}{\lambda_{1}^{2} + n_{1}^{2}} e_{1} e_{1}^{t} + \frac{\lambda_{2}}{\lambda_{2}^{2} + n_{2}^{2}} e_{2} e_{2}^{t} \right) \cdot \begin{bmatrix}\sigma_{xt}^{2} \\ \sigma_{yt}^{2}\end{bmatrix}$

where superscript t represents the transpose operation. Here n₁ & n₂ are the computational or signal noise involved in calculating λ₁ & λ₂ respectively. In practice n₁≈n₂, both being determined by, and approximately equal to, the noise in the coefficients of M. When λ₁ & λ₂ << n the calculated motion vector is zero, as is appropriate for a plain region of the image. When λ₁ >> n and λ₂ << n the calculated motion vector is normal to the predominant edge in that part of the image. Finally, if λ₁, λ₂ >> n then equation 6 becomes equivalent to equation 4. As signal noise, and hence n, decreases, equation 6 provides an increasingly accurate estimate of the motion vectors, as would be expected intuitively.
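The behaviour of equation 6 in the three cases just described may be illustrated by the following Python sketch. It is offered as an illustration only: the function name `martinez_vector` is hypothetical, a single noise value n is assumed for both eigenvalues (n₁ = n₂), and the eigen-analysis of the symmetric 2×2 matrix M is done in closed form rather than by the lookup tables of the apparatus.

```python
import math
from typing import Tuple


def martinez_vector(sxx: float, sxy: float, syy: float,
                    sxt: float, syt: float, noise: float) -> Tuple[float, float]:
    """Evaluate equation 6 for M = [[sxx, sxy], [sxy, syy]] and c = (sxt, syt)."""
    # Closed-form eigen-analysis of a symmetric 2x2 matrix.
    mean = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    eigenvalues = (mean + radius, mean - radius)
    axis = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # direction of e1
    e1 = (math.cos(axis), math.sin(axis))
    e2 = (-e1[1], e1[0])                           # unit vector normal to e1

    u0 = v0 = 0.0
    for lam, (ex, ey) in zip(eigenvalues, (e1, e2)):
        denom = lam * lam + noise * noise
        gain = lam / denom if denom > 0.0 else 0.0  # lambda / (lambda^2 + n^2)
        projection = ex * sxt + ey * syt            # e^t . c
        u0 -= gain * projection * ex
        v0 -= gain * projection * ey
    return u0, v0


detailed = martinez_vector(2.0, 1.0, 2.0, -3.0, 0.0, noise=0.0)
plain = martinez_vector(0.0, 0.0, 0.0, 0.0, 0.0, noise=1.0)
edge = martinez_vector(4.0, 0.0, 0.0, -8.0, 0.0, noise=0.01)
```

With these hypothetical inputs, `detailed` reproduces the equation 4 result (2, −1) because n = 0, `plain` is the zero vector, and `edge` has only a component normal to the (vertical) edge, close to 2 pixels per field.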

In practice calculating motion vectors using the Martinez technique involves replacing the apparatus of FIG. 3, below, with more complex circuitry. The direct solution of equation 6 would involve daunting computational and hardware complexity. It can, however, be implemented using only two-input, pre-calculated, look up tables and simple arithmetic operations. It is another object of the present invention to provide a streamlined implementation of the Martinez technique.

The invention provides motion vector estimation apparatus for use in video signal processing comprising means for calculating image gradients for each input sampling site of a picture sequence, the image gradients being calculated on the same standard as the input signal, means for converting the image gradients from the first standard to a second, output standard, and means for generating a plurality of motion vectors from the image gradients, the apparatus being arranged to convert the image gradients from the input standard to the output standard before calculation of motion vectors, thereby producing motion vectors on the desired output standard. The motion vectors are calculated on the output standard, thereby avoiding the difficulties and inaccuracies involved in converting the signals to the output standard after calculation of the motion vectors.

The apparatus may comprise temporal and spatial low pass filters for prefiltering the input video signal. Prefiltering increases the maximum motion speed which can be measured and reduces the deleterious effects of vertical/temporal aliasing.

The means for calculating the image gradients may comprise temporal and spatial (horizontal and vertical) differentiators.

The means for converting the image gradients from the input standard to the output standard comprises vertical/temporal interpolators, for example linear (polyphase) interpolators such as bilinear interpolators.

The image gradients corresponding to a plurality of output sampling sites are used to calculate the motion vectors. The motion vectors may be calculated using a least mean square solution for a group of neighbouring output sampling sites.

In an embodiment the apparatus further comprises a multiplier array having as its inputs the image gradients previously calculated and converted to the output standard, and corresponding low pass filters for summing the image gradient products. The means for calculating the motion vectors utilises the sums of the image gradient products corresponding to a group of neighbouring output sampling sites to produce the best fit motion vector for the group of sampling sites. A different group of neighbouring sampling sites may be used to calculate each motion vector. The means for calculating motion vectors determines the best fit motion vector given by equation 4 or equation 6 as herein defined.

In an alternative embodiment the apparatus comprises rectangular to polar coordinate converter means having the spatial image gradients converted to the output standard as its inputs, and the motion vectors are determined for a group of output sampling sites based on the angle and magnitude of the image gradients of each sampling site in said group. The motion vectors are calculated on the basis of equation 11 or 13 as herein defined.

The invention also provides a method of motion estimation in video or film signal processing comprising calculating image gradients for each input sampling site of a picture sequence, the image gradients being calculated on a first, input standard, and generating a plurality of motion vectors from the image gradients, the image gradients being converted to a second, output standard before generating the motion vectors, thereby generating motion vectors on the desired output standard.

The method may comprise a prefiltering step. The input video signal may be prefiltered, for example using temporal and spatial lowpass filters.

The image gradients corresponding to a plurality of output sampling sites are used to calculate the motion vectors. The motion vectors may be calculated using a least mean square solution for a group of neighbouring sampling sites.

The step of generating motion vectors may comprise using the sums of the image gradient products corresponding to a group of neighbouring output sampling sites to produce the best fit motion vector for each said group. The motion vectors may be calculated using equation 4 or 6 as defined herein.

In an embodiment the step of generating motion vectors may comprise performing eigen-analyses on the sums of the image gradient products using the spatial image gradients converted to the output standard and assigning two eigenvectors and eigenvalues to each output sampling site. The motion vector for each group of sampling sites is calculated by applying equation 6, as herein defined, to the results of the eigen analyses.

In another embodiment the step of generating motion vectors comprises transforming the spatial image gradient vectors on the output standard from rectangular to polar coordinates, and the motion vectors are determined for a group of output sampling sites based on the angle and magnitude of the image gradients of each sampling site in said group. The motion vectors are calculated on the basis of equation 11 or 13 as herein defined.

The invention will now be described in more detail with reference to the accompanying drawings in which:

FIG. 1 shows graphically the image gradient constraint lines for three pixels.

FIGS. 2 and 3 together show a block diagram of a motion estimator according to an embodiment of the invention.

FIG. 4 is a block diagram of apparatus for calculating motion vectors which can be substituted for the apparatus of FIG. 3.

FIG. 5 is a block diagram of apparatus for implementing the eigen analysis required in FIG. 4.

FIGS. 6 and 7 show another embodiment of the gradient motion estimation apparatus according to the invention.

FIG. 8 shows graphically the distribution of errors in the case of a best fit motion vector.

FIGS. 9 and 10 are block diagrams of apparatus capable of providing an indication of the error of motion vectors in a motion estimation system.

A block diagram of a direct implementation of gradient motion estimation is shown in FIGS. 2 & 3.

The apparatus shown schematically in FIG. 2 performs filtering and calculation of gradient products and their summations. The apparatus of FIG. 3 generates motion vectors from the sums of gradient products produced by the apparatus of FIG. 2. The horizontal and vertical low pass filters (10, 12) in FIG. 2 perform spatial prefiltering as discussed above. The cut-off frequencies of 1/32nd band horizontally and 1/16th band vertically allow motion speeds up to (at least) 32 pixels per field to be measured. Different cut-off frequencies could be used if a different range of speeds is required. The image gradients are calculated by three temporal and spatial differentiators (16, 17, 18).

The vertical/temporal interpolation filters (20) convert the image gradients, measured on the input standard, to the output standard. Typically the vertical/temporal interpolators (20) are bilinear interpolators or other polyphase linear interpolators. Thus the output motion vectors are also on the output standard. The interpolation filters are a novel feature which facilitates interfacing the motion estimator to a motion compensated temporal interpolator. Temporal low pass filtering is normally performed as part of (all 3 of) the interpolation filters. The temporal filter (14) has been re-positioned in the processing path so that only one rather than three filters is required. Note that the filters prior to the multiplier array can be implemented in any order because they are linear filters. The summations of gradient products, specified in equation 3, are implemented by the low pass filters (24) following the multiplier array (22). Typically these filters would be (spatial) running average filters, which give equal weight to each tap within their region of support. Other lowpass filters could also be used at the expense of more complex hardware. The size of these filters (24) determines the size of the neighbourhood used to calculate the best fitting motion vector. Examples of filter coefficients which may be used can be found in the example.

A block diagram of apparatus capable of implementing equation 6, and which replaces that of FIG. 3, is shown in FIGS. 4 and 5.

Each of the ‘eigen analysis’ blocks (30), in FIG. 4, performs the analysis for one of the two eigenvectors. The output of the eigen-analysis is a vector (with x and y components) equal to

$s_{i} = e_{i} \cdot \sqrt{\frac{\lambda_{i}}{\lambda_{i}^{2} + n^{2}}}$

These ‘s’ vectors are combined with the vector $(\sigma_{xt}^{2}, \sigma_{yt}^{2})$ (denoted c in FIG. 4), according to equation 6, to give the motion vector according to the Martinez technique.

The eigen analysis, illustrated in FIG. 5, has been carefully structured so that it can be implemented using lookup tables with no more than 2 inputs. This has been done since lookup tables with 3 or more inputs would be impracticably large using today's technology. The implementation of FIG. 5 is based on first normalising the matrix M by dividing all its elements by $(\sigma_{xx}^{2} + \sigma_{yy}^{2})$. This yields a new matrix, N, with the same eigenvectors (e₁ & e₂) and different (but related) eigenvalues (χ₁ & χ₂). The relationship between M, N and their eigenvectors and values is given by Equation 7;

$N = \frac{1}{\sigma_{xx}^{2} + \sigma_{yy}^{2}} M = \begin{bmatrix}\frac{\sigma_{xx}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}} & \frac{\sigma_{xy}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}} \\ \frac{\sigma_{xy}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}} & \frac{\sigma_{yy}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}}\end{bmatrix}$

$M \cdot e_{i} = \lambda_{i} \cdot e_{i}$

$N \cdot e_{i} = \chi_{i} \cdot e_{i}$

$\lambda_{i} = (\sigma_{xx}^{2} + \sigma_{yy}^{2}) \chi_{i}$

$n_{\lambda} = (\sigma_{xx}^{2} + \sigma_{yy}^{2}) n_{\chi}$

Matrix N is simpler than M as it contains only two independent values, since the principal diagonal elements ($N_{1,1}$, $N_{2,2}$) sum to unity and the minor diagonal elements ($N_{1,2}$, $N_{2,1}$) are identical. The principal diagonal elements may be coded as $(\sigma_{xx}^{2} - \sigma_{yy}^{2})/(\sigma_{xx}^{2} + \sigma_{yy}^{2})$ since Equation 8;

$N_{1,1} = \frac{1}{2}\left( 1 + \frac{\sigma_{xx}^{2} - \sigma_{yy}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}} \right)$

$N_{2,2} = \frac{1}{2}\left( 1 - \frac{\sigma_{xx}^{2} - \sigma_{yy}^{2}}{\sigma_{xx}^{2} + \sigma_{yy}^{2}} \right)$

Hence lookup tables 1 & 2 have all the information they require to find the eigenvalues and vectors of N using standard techniques. It is therefore straightforward to precalculate the contents of these lookup tables. Lookup table 3 simply implements the square root function. The key features of the apparatus shown in FIG. 5 are that the eigen analysis is performed on the normalised matrix, N, using 2 input lookup tables (1 & 2) and the eigenvalue analysis (from table 2) is rescaled to the correct value using the output of table 3.
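The following Python sketch illustrates why two inputs suffice for the eigenvalue analysis of N and how the eigenvalues of M are recovered by rescaling (equations 7 and 8). It is an illustration under stated assumptions: the function name `normalised_eigenvalues` is hypothetical, and direct arithmetic stands in for the precalculated lookup tables of FIG. 5.

```python
import math
from typing import Tuple


def normalised_eigenvalues(sxx: float, sxy: float, syy: float
                           ) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Return ((chi1, chi2), (lambda1, lambda2)) for M = [[sxx, sxy], [sxy, syy]]."""
    trace = sxx + syy
    if trace == 0.0:
        raise ValueError("no detail in region: M cannot be normalised")
    # The only two inputs needed to describe N (equation 8).
    d = (sxx - syy) / trace   # codes N11 = (1 + d) / 2 and N22 = (1 - d) / 2
    b = sxy / trace           # N12 = N21
    radius = math.hypot(0.5 * d, b)
    chi = (0.5 + radius, 0.5 - radius)        # eigenvalues of N
    lam = (trace * chi[0], trace * chi[1])    # rescaled to eigenvalues of M
    return chi, lam


chi, lam = normalised_eigenvalues(2.0, 1.0, 2.0)
```

For the hypothetical sums shown, N has eigenvalues 0.75 and 0.25, and rescaling by σxx² + σyy² = 4 gives the eigenvalues 3 and 1 of M.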

The gradient motion estimator described above is undesirably complex. The motion estimator is robust to images containing limited information but FIGS. 4 and 5 show the considerable complexity involved. The situation is made worse by the fact that many of the signals have a very wide dynamic range, making the functional blocks illustrated much more difficult to implement.

A technique is described below which yields considerable simplifications without sacrificing performance. This is based on normalising the basic constraint equation (equation 2) to control the dynamic range of the signals. As well as reducing dynamic range this also makes other simplifications possible.

Dividing the constraint equation by the modulus of the gradient vector yields a normalised constraint equation, i.e. Equation 9:

$\frac{u \frac{\partial I}{\partial x} + v \frac{\partial I}{\partial y}}{\left| \nabla I \right|} = - \frac{\frac{\partial I}{\partial t}}{\left| \nabla I \right|}$

where

$\nabla I = \begin{bmatrix}\frac{\partial I}{\partial x} \\ \frac{\partial I}{\partial y}\end{bmatrix} \quad \& \quad \left| \nabla I \right| = \sqrt{\left( \frac{\partial I}{\partial x} \right)^{2} + \left( \frac{\partial I}{\partial y} \right)^{2}}$

The significance of this normalisation step becomes more apparent if equation 9 is rewritten as Equation 10;

u·cos(θ)+v·sin(θ)=vn

where

$\cos(\theta) = \frac{\frac{\partial I}{\partial x}}{\left| \nabla I \right|}, \quad \sin(\theta) = \frac{\frac{\partial I}{\partial y}}{\left| \nabla I \right|}, \quad vn = - \frac{\frac{\partial I}{\partial t}}{\left| \nabla I \right|}$

in which θ is the angle between the spatial image gradient vector (∇I) and the horizontal; vn is the motion speed in the direction of the image gradient vector, that is, normal to the predominant edge in the picture at that point. This seems a much more intuitive equation relating, as it does, the motion vector to the image gradient and the motion speed in the direction of the image gradient. The coefficients of equation 10 (cos(θ) & sin(θ)) have a well defined range (0 to 1 in magnitude) and approximately the same dynamic range as the input signal (typically 8 bits). Similarly vn has a maximum (sensible) value determined by the desired motion vector measurement range. Values of vn greater than the maximum measurement range, which could result from either noise or ‘cuts’ in the input picture sequence, can reasonably be clipped to the maximum sensible motion speed.

The normalised constraint equation 10 can be solved to find the motion vector in the same way as the unnormalised constraint equation 2. With normalisation equation 3 becomes Equation 11;

$\begin{bmatrix}\sum \cos^{2}(\theta) & \sum \cos(\theta) \cdot \sin(\theta) \\ \sum \cos(\theta) \cdot \sin(\theta) & \sum \sin^{2}(\theta)\end{bmatrix} \cdot \begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = \begin{bmatrix}\sum vn \cdot \cos(\theta) \\ \sum vn \cdot \sin(\theta)\end{bmatrix}$

or:

$\Phi \cdot \begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = \psi$

In fact matrix Φ has only 2 independent elements, since cos²(x)+sin²(x)=1. This is more clearly seen by rewriting cos²(x) and sin²(x) as ½(1±cos(2x)); hence equation 11 becomes Equation 12;

$\frac{1}{2} \cdot \left( N \cdot I + \begin{bmatrix}\sum \cos(2\theta) & \sum \sin(2\theta) \\ \sum \sin(2\theta) & -\sum \cos(2\theta)\end{bmatrix} \right) \cdot \begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = \begin{bmatrix}\sum vn \cdot \cos(\theta) \\ \sum vn \cdot \sin(\theta)\end{bmatrix}$

where I is the (2×2) identity matrix and N is the number of pixels included in the summations. Again the motion vector can be found using equation 13:

$\begin{bmatrix}u_{0} \\ v_{0}\end{bmatrix} = \left( \frac{\lambda_{1}}{\lambda_{1}^{2} + n_{1}^{2}} e_{1} e_{1}^{t} + \frac{\lambda_{2}}{\lambda_{2}^{2} + n_{2}^{2}} e_{2} e_{2}^{t} \right) \cdot \begin{bmatrix}\sum vn \cdot \cos(\theta) \\ \sum vn \cdot \sin(\theta)\end{bmatrix}$

where now e and λ are the eigenvectors and eigenvalues of Φ rather than M. Now, because Φ only has two independent elements, the eigen-analysis can be performed using only three, two-input, lookup tables; furthermore the dynamic range of the elements of Φ (equation 11) is much less than that of the elements of M, thereby greatly simplifying the hardware complexity.
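A minimal Python sketch of the normalised solution is given below for illustration. It builds Φ from the two sums Σcos(2θ) and Σsin(2θ) as in equation 12 and applies the noise-weighted inverse to ψ. No additional sign change is applied because, by equation 11, Φ·(u₀, v₀)ᵗ = ψ and vn already carries the minus sign of equation 10. The function name `normalised_best_fit`, the single noise value n and the list-of-pairs input are assumptions made for the example.

```python
import math
from typing import Iterable, Tuple


def normalised_best_fit(constraints: Iterable[Tuple[float, float]],
                        noise: float) -> Tuple[float, float]:
    """Best fit (u0, v0) from per-pixel (vn, theta) pairs."""
    count = 0
    sum_c2 = sum_s2 = psi_x = psi_y = 0.0
    for vn, theta in constraints:
        count += 1
        sum_c2 += math.cos(2.0 * theta)
        sum_s2 += math.sin(2.0 * theta)
        psi_x += vn * math.cos(theta)
        psi_y += vn * math.sin(theta)

    # Phi = 0.5 * (count * I + [[sum_c2, sum_s2], [sum_s2, -sum_c2]])
    radius = 0.5 * math.hypot(sum_c2, sum_s2)
    eigenvalues = (0.5 * count + radius, 0.5 * count - radius)
    axis = 0.5 * math.atan2(sum_s2, sum_c2)   # direction of e1
    e1 = (math.cos(axis), math.sin(axis))
    e2 = (-e1[1], e1[0])

    u0 = v0 = 0.0
    for lam, (ex, ey) in zip(eigenvalues, (e1, e2)):
        denom = lam * lam + noise * noise
        gain = lam / denom if denom > 0.0 else 0.0
        projection = ex * psi_x + ey * psi_y
        u0 += gain * projection * ex
        v0 += gain * projection * ey
    return u0, v0


# Three pixels consistent with a motion of (2, -1): vn = 2 cos(theta) - sin(theta).
angles = [0.0, math.pi / 2.0, math.pi / 4.0]
pixels = [(2.0 * math.cos(a) - math.sin(a), a) for a in angles]
u0, v0 = normalised_best_fit(pixels, noise=0.0)
```

With n = 0 and sufficient detail the weighted sum reduces to Φ⁻¹, so the hypothetical pixels above return the motion (2, −1) from which they were generated.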

A block diagram of a gradient motion estimator using the Martinez technique and based on the normalised constraint equation is shown in FIGS. 6 & 7.

The apparatus of FIG. 6 performs the calculation of the normalised constraint equation (equation 10) for each pixel or data value. Obviously, if prefiltering is performed the number of independent pixel values is reduced; the effective pixel size is greater. The filtering in FIG. 6 is identical to that in FIG. 2. The spatial image gradients converted to the output standard are used as inputs for a rectangular to polar coordinate converter (32) which calculates the magnitude of the spatial image vector and the angle θ. A suitable converter can be obtained from Raytheon (Coordinate transformer, model TMC 2330). A lookup table (34) is used to avoid division by very small numbers when there is no detail in a region of the input image. The constant term, ‘n’, used in the lookup table is the measurement noise in estimating |∇I|, which depends on the input signal to noise ratio and the prefiltering used. A limiter (36) has also been introduced to restrict the normal velocity, vn, to its expected range (determined by the spatial prefilter). The normal velocity might, otherwise, exceed its expected range when the constraint equation is violated, for example at picture cuts. A key feature of FIG. 6 is that, due to the normalisation that has been performed, the two outputs, vn & θ, have a much smaller dynamic range than the three image gradients in FIG. 2, thereby allowing a reduction in the hardware complexity.

In the apparatus of FIG. 6 the input video is first filtered using separate temporal, vertical and horizontal filters (10, 12, 14), the image gradients are calculated using three differentiating filters (16, 18) and then converted, from the input lattice, to the output sampling lattice using three vertical/temporal interpolators (20), typically bilinear or other polyphase linear filters. For example, with a 625/50/2:1 input the image gradients are calculated on a 525/60/2:1 lattice.

The parameters of the normalised constraint equation, vn & θ, are calculated as shown.
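The per-pixel processing of FIG. 6 (coordinate converter (32), lookup table (34) and limiter (36)) may be sketched in Python as below. This is an illustrative assumption rather than a description of the table contents: the text does not state the function held in lookup table (34), so the sketch assumes the noise-weighted reciprocal |∇I|/(|∇I|² + n²) in place of 1/|∇I|. The names `normalised_constraint` and `vn_max` are hypothetical.

```python
import math
from typing import Tuple


def normalised_constraint(ix: float, iy: float, it: float,
                          noise: float, vn_max: float) -> Tuple[float, float]:
    """Return (vn, theta) for one pixel from its three image gradients."""
    # Rectangular to polar conversion of the spatial gradient vector.
    magnitude = math.hypot(ix, iy)
    theta = math.atan2(iy, ix)

    # ASSUMPTION: regularised reciprocal, so that plain areas give vn = 0
    # instead of a division by a very small number.
    denom = magnitude * magnitude + noise * noise
    reciprocal = magnitude / denom if denom > 0.0 else 0.0
    vn = -it * reciprocal

    # Limiter: clip to the measurement range set by the spatial prefilter.
    vn = max(-vn_max, min(vn_max, vn))
    return vn, theta


vn, theta = normalised_constraint(2.0, 0.0, -10.0, noise=0.0, vn_max=32.0)
```

With the gradients of the earlier worked example (+2 grey levels per pixel horizontally, −10 grey levels per field temporally) and n = 0, the sketch gives vn = 5 and θ = 0.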

The apparatus of FIG. 7 calculates the best fitting motion vector, corresponding to a region of the input image, from the constraint equations for the pixels in that region. The summations specified in equation 12 are implemented by the lowpass filters (38) following the polar to rectangular coordinate converter (40) and lookup tables 1 & 2. Typically these filters (38) would be (spatial) running average filters, which give equal weight to each tap within their region of support. Other lowpass filters could also be used at the expense of more complex hardware. The size of these filters (38) determines the size of the neighbourhood used to calculate the best fitting motion vector. Lookup tables 1 & 2 are simply cosine and sine lookup tables. Lookup tables 3 to 5 contain precalculated values of matrix ‘Z’ defined by Equation 14;

$Z = \left( \frac{\lambda_{1}}{\lambda_{1}^{2} + n_{1}^{2}} e_{1} e_{1}^{t} + \frac{\lambda_{2}}{\lambda_{2}^{2} + n_{2}^{2}} e_{2} e_{2}^{t} \right)$

where e and λ are the eigenvectors and eigenvalues of Φ. Alternatively Z could be Φ⁻¹ (i.e. assuming no noise), but this would not apply the Martinez technique and would give inferior results. A key feature of FIG. 7 is that the elements of matrix Z are derived using 2 input lookup tables. Their inputs are the outputs from the two lowpass filters (39), which have a small dynamic range, allowing the use of small lookup tables.

The implementations of the gradient motion techniques discussed aboveseek to find the ‘best’ motion vector for a region of the input picture.However it is only appropriate to use this motion vector, for motioncompensated processing, if it is reasonably accurate. Whilst thedetermined motion vector is the ‘best fit’ this does not necessarilyimply that it is also an accurate vector. The use of inaccurate motionvectors, in performing motion compensated temporal interpolation,results in objectionable impairments to the interpolated image. To avoidthese impairments it is desirable to revert to a non-motion compensatedinterpolation algorithm when the motion vector cannot be measuredaccurately. To do this it is necessary to know the accuracy of theestimated motion vectors. If a measure of vector accuracy is availablethen the interpolation method can be varied between ‘full motioncompensation’ and no motion compensation depending on vector accuracy, atechnique known as ‘graceful fallback’ described in references 4 & 16.

A technique for measuring the accuracy of motion vectors is based on the use of the constraint equation and hence is particularly suitable for use with gradient based motion estimation techniques as described above. The method, however, is more general than this and could also be used to estimate the accuracy of motion vectors measured in other ways. The measurement of the accuracy of motion vectors is a new technique. Most of the literature on motion estimation concentrates almost wholly on ways of determining the ‘best’ motion vector and pays scant regard to considering whether the resulting motion vectors are actually accurate. This may, in part, explain why motion compensated processing is, typically, unreliable for certain types of input image.

Once a motion vector has been estimated for a region of an image an error may be calculated for each pixel within that region. That error is an indication of how accurately the motion vector satisfies the constraint equation or the normalised constraint equation (equations 2 and 10 above respectively). The following discussion will use the normalised constraint equation as this seems a more objective choice but the unnormalised constraint equation could also be used with minor changes (the use of the unnormalised constraint equation amounts to giving greater prominence to pixels with larger image gradients). For the i-th pixel within the analysis region the error is given by Equation 15;

$\mathrm{error}_{i} = vn_{i} - u_{0} \cos(\theta_{i}) - v_{0} \sin(\theta_{i})$

(for all i where 1 ≤ i ≤ N, where N is the number of pixels in the analysis region).

This error corresponds to the distance of the ‘best’ motion vector, (u₀, v₀), from the constraint line for that pixel (see FIG. 1). Note that equation 11 above gives a motion vector which minimises the sum of the squares of these errors. Each error value is associated with the direction of the image gradient for that pixel. Hence the errors are better described as an error vector, E_(i), illustrated in FIG. 1 and defined by Equation 16;

$E_{i}^{t} = \mathrm{error}_{i} \cdot [\cos(\theta_{i}), \sin(\theta_{i})]$

where superscript t represents the transpose operation.
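Equations 15 and 16 can be sketched in a few lines of Python. The function name `error_vectors` and the sample pixel values are hypothetical and are only meant to show the arithmetic: each pixel contributes a scalar error, which is then directed along that pixel's gradient orientation θ.

```python
import math


def error_vectors(vn, theta, u0, v0):
    """Return one error vector (Ex, Ey) per pixel.

    vn[i] is the motion speed along the gradient for pixel i, theta[i] the
    gradient orientation, and (u0, v0) the 'best' motion vector for the region.
    """
    vectors = []
    for vn_i, theta_i in zip(vn, theta):
        c, s = math.cos(theta_i), math.sin(theta_i)
        error_i = vn_i - u0 * c - v0 * s        # Equation 15
        vectors.append((error_i * c, error_i * s))  # Equation 16
    return vectors


# Two hypothetical pixels: one with a horizontal gradient, one with a vertical gradient.
sample = error_vectors([3.0, 4.0], [0.0, math.pi / 2.0], u0=2.0, v0=5.0)
```

A pixel whose constraint line passes exactly through (u₀, v₀) yields a zero error vector.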

The set of error vectors, {E_(i)}, forms a two dimensional distribution of errors in motion vector space, illustrated in FIG. 8. This distribution of motion vector measurement errors would be expected to be a two dimensional Gaussian (or Normal) distribution. Conceptually the distribution occupies an elliptical region around the true motion vector. The ellipse defines the area in which most of the estimates of the motion vector would lie; the ‘best’ motion vector points to the centre of the ellipse. FIG. 8 illustrates the ‘best’ motion vector, (u₀, v₀), and 4 typical error vectors, E₁ to E₄. The distribution of motion vector measurement errors is characterised by the orientation and length of the major and minor axes (σ₁, σ₂) of the ellipse. To calculate the characteristics of this distribution we must first form the (N×2) matrix defined as Equation 17;

$E = \begin{bmatrix} E_{1}^{t} \\ E_{2}^{t} \\ \vdots \\ E_{N}^{t} \end{bmatrix} = \begin{bmatrix} \mathrm{error}_{1} \cdot \cos(\theta_{1}) & \mathrm{error}_{1} \cdot \sin(\theta_{1}) \\ \mathrm{error}_{2} \cdot \cos(\theta_{2}) & \mathrm{error}_{2} \cdot \sin(\theta_{2}) \\ \vdots & \vdots \\ \mathrm{error}_{N} \cdot \cos(\theta_{N}) & \mathrm{error}_{N} \cdot \sin(\theta_{N}) \end{bmatrix}$

The length and orientation of the axes of the error distribution are given by eigenvector analysis of E^(t)·E; the eigenvectors point along the axes of the distribution and the eigenvalues, N_(total)·σ₁² & N_(total)·σ₂² (where N_(total) is the total number of pixels in the region used to estimate the errors), give their length (see FIG. 8), that is Equation 18;

$(E^{t} \cdot E) \cdot e_{i} = N_{total} \, \sigma_{i}^{2} \cdot e_{i}$ where i=1 or 2

The matrix E^(t)·E (henceforth the ‘error matrix’ and denoted Q for brevity) can be expanded to give Equation 19;

$E^{t} \cdot E = \begin{bmatrix} \sum \mathrm{error}^{2} \cdot \cos^{2}(\theta) & \sum \mathrm{error}^{2} \cdot \cos(\theta) \cdot \sin(\theta) \\ \sum \mathrm{error}^{2} \cdot \cos(\theta) \cdot \sin(\theta) & \sum \mathrm{error}^{2} \cdot \sin^{2}(\theta) \end{bmatrix}$

where the summation is over a region of the image.
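The error matrix of Equation 19 and the axis lengths implied by Equation 18 can be sketched as follows. This is an illustrative Python fragment, not the hardware of FIGS. 9 and 10; the names `error_matrix` and `axis_sigmas` are assumptions, and the eigenvalues of the 2×2 symmetric matrix are obtained from the usual closed form.

```python
import math


def error_matrix(error_vectors):
    """Elements (Q11, Q12, Q22) of Q = E^t . E for a list of (Ex, Ey) error vectors."""
    q11 = sum(ex * ex for ex, _ in error_vectors)
    q12 = sum(ex * ey for ex, ey in error_vectors)
    q22 = sum(ey * ey for _, ey in error_vectors)
    return q11, q12, q22


def axis_sigmas(q11, q12, q22, n_total):
    """Standard deviations (major, minor) along the principal axes.

    The eigenvalues of Q are N_total * sigma_i^2, so sigma_i = sqrt(lambda_i / N_total).
    """
    mean = 0.5 * (q11 + q22)
    root = math.hypot(0.5 * (q11 - q22), q12)
    lam_major = mean + root
    lam_minor = max(mean - root, 0.0)  # guard against rounding below zero
    return math.sqrt(lam_major / n_total), math.sqrt(lam_minor / n_total)


# Four hypothetical error vectors spread more widely vertically than horizontally.
vectors = [(1.0, 0.0), (-1.0, 0.0), (0.0, 2.0), (0.0, -2.0)]
q = error_matrix(vectors)
sigma_major, sigma_minor = axis_sigmas(*q, n_total=len(vectors))
```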

The likely motion vector error depends on how the motion vector was measured. If the motion vector was calculated using, for example, block matching then the likely error would be approximately as determined by the above analysis. However it is quite likely that this error estimation technique of the invention would be applied to motion vectors calculated using gradient (constraint equation) based motion estimation. In this latter case the motion vector is, itself, effectively the ‘average’ of many measurements (i.e. 1 measurement per constraint equation used). Hence the error in the gradient based motion vector is less than the error estimated from the ‘error matrix’ above. This is an example of the well known effect of taking an average of many measurements to improve the accuracy. If larger picture regions are used for gradient motion estimation then more accurate motion vectors are obtained (at the expense, of course, of being unable to resolve small objects). By contrast, taking larger regions in a block matching motion estimator does not necessarily increase the vector accuracy (assuming the selected vector is correct); it does, however, reduce the chance of measuring a ‘spurious’ vector.

The likely error in the motion vector may be less than the ‘size’ of the distribution of error vectors. The reduction is specified by a parameter N_(effective) which depends on how the motion vector was measured. For block matching, N_(effective) would be approximately 1. For gradient motion estimation N_(effective) might be as high as the number of pixels used in the measurement. It is more likely, however, that N_(effective) is less than the number of pixels due to the effects of prefiltering the video prior to motion estimation. Prefiltering effectively ‘enlarges’ the pixels (i.e. individual pixels are no longer independent), reducing the effective number of pixels (N_(effective)). Typically the region of the image used both to calculate the motion vector and estimate its error might be 3 times the ‘size’ (both horizontally and vertically) of the prefilter used. This would give a typical value for N_(effective) of 3². For a given value of N_(effective) the size of the error distribution, calculated above, must be reduced by the square root of N_(effective). This is the well known result for the reduction in error due to averaging N_(effective) measurements. Thus, for a typical gradient based motion estimator in which N_(effective) is 9, the likely error in the measured motion vector is 3 times less than the distribution of vector errors calculated above.

In an embodiment, the averaging filter is 95 pixels by 47 field lines, so the total number of pixels (N_(total) in FIG. 10) is 4465. The effective number of pixels (N_(effective)) used in error estimation will be less than the total number of pixels if prefiltering is performed. In the specification of the gradient motion estimator parameters in the example, the spatial pre-filter is 1/16th band vertical intra-field and 1/32nd band horizontal. The error estimation region is 3 times the effective size of the spatial pre-filters both horizontally and vertically, giving an effective number of pixels used in the selected error estimation region of 9.
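The arithmetic of this embodiment can be checked with a short Python sketch. The variable and function names are illustrative assumptions; the figures (95 × 47 region, a region 3 times the prefilter size in each direction) are those quoted in the text.

```python
import math

REGION_PIXELS = 95   # horizontal extent of the averaging filter
REGION_LINES = 47    # vertical extent in field lines

# Total number of pixels in the error estimation region (N_total).
n_total = REGION_PIXELS * REGION_LINES

# Region is 3 prefilter 'sizes' wide and 3 high, so roughly 3 * 3 independent samples.
n_effective = 3 * 3


def likely_error(sigma, n_eff):
    """Reduce the spread of the error distribution by sqrt(N_effective)."""
    return sigma / math.sqrt(n_eff)


# A hypothetical spread of 1.5 pixels/field shrinks by a factor of 3.
reduced = likely_error(1.5, n_effective)
```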

To calculate the distribution of motion vector measurement errors it is necessary to first calculate the elements of the error matrix, according to equation 19, then calculate its eigenvectors and eigenvalues. The elements of the error matrix may be calculated by the apparatus of FIG. 9. Other implementations are possible, but FIG. 9 is straightforward and efficient. The inputs to FIG. 9, θ and vn, may be derived as in FIG. 6. The motion vector input to FIG. 9, (u, v), could be derived as in FIG. 7, however it could equally well come from any other source such as FIG. 3 or 4 or even a block matching motion estimator. The lookup tables (1 and 2) are simply cosine and sine tables and, as in FIGS. 2 & 7, the required summations are performed using spatial lowpass filters (42) such as running average filters.

Once the error matrix has been calculated (e.g. as in FIG. 9) its eigenvalues and eigenvectors may be found using the implementation of FIG. 10 whose inputs are the elements of the error matrix, i.e. Σ(error²·cos²(θ)), Σ(error²·cos(θ)·sin(θ)) and Σ(error²·sin²(θ)), denoted Q₁₁, Q₁₂ and Q₂₂ respectively. Note that, as in FIG. 5, since there are two eigenvalues the implementation of FIG. 10 must be duplicated to generate both eigenvectors. As in FIG. 5, described previously, the implementation of FIG. 10 has been carefully structured so that it uses lookup tables with no more than 2 inputs. In FIG. 10 the output of lookup table 1 is the angular orientation of an eigenvector, that is the orientation of one of the principal axes of the (2 dimensional) error distribution. The output of lookup table 2, once it has been rescaled by the output of lookup table 3, is inversely proportional to the corresponding eigenvalue. An alternative function of the eigenvalue (other than its inverse) may be used depending on the application of the motion vector error information.

The spread vector outputs of FIG. 10 (i.e. (Sx_(i), Sy_(i)), i=1, 2) describe the likely motion vector measurement error for each motion vector in two dimensions. Since a video motion vector is a (2 dimensional) vector quantity, two vectors are required to describe the measurement error. In this implementation the spread vectors point along the principal axes of the distribution of vector measurement errors and their magnitude is the inverse of the standard deviation of measurement error along these axes. If we assume, for example, that the measurement errors are distributed as a 2 dimensional Gaussian distribution, then the probability distribution of the motion vector, v, is given by equation 20;

$P(v) = \frac{|S_{1}| \cdot |S_{2}|}{2\pi} \cdot \exp\left( -\frac{1}{2} \left( ((v - v_{m}) \cdot S_{1})^{2} + ((v - v_{m}) \cdot S_{2})^{2} \right) \right)$

where v_(m) is the measured motion vector and S₁ and S₂ are the two spread vectors. Of course, the motion vector measurement errors may not have a Gaussian distribution but the spread vectors, defined above, still provide a useful measure of the error distribution. For some applications it may be more convenient to define spread vectors whose magnitude is a different function of the error matrix eigenvalues.
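The following Python sketch evaluates a two dimensional Gaussian density from a pair of spread vectors. It assumes the conventional factor of one half in the exponent, which is what makes the density integrate to one when the spread vectors are perpendicular and have magnitude 1/σ; the function name `vector_probability` and the sample spread vectors are hypothetical.

```python
import math


def vector_probability(v, v_m, s1, s2):
    """Probability density of motion vector v given the measured vector v_m.

    s1 and s2 are the spread vectors: perpendicular, pointing along the principal
    axes, with magnitude equal to the inverse standard deviation along each axis.
    """
    dx, dy = v[0] - v_m[0], v[1] - v_m[1]
    p1 = dx * s1[0] + dy * s1[1]   # (v - v_m) . S1
    p2 = dx * s2[0] + dy * s2[1]   # (v - v_m) . S2
    norm = math.hypot(s1[0], s1[1]) * math.hypot(s2[0], s2[1]) / (2.0 * math.pi)
    return norm * math.exp(-0.5 * (p1 * p1 + p2 * p2))


# Hypothetical spread vectors: sigma = 0.5 horizontally, sigma = 2 vertically.
s1, s2 = (2.0, 0.0), (0.0, 0.5)
peak = vector_probability((3.0, 1.0), (3.0, 1.0), s1, s2)
one_sigma = vector_probability((3.5, 1.0), (3.0, 1.0), s1, s2)
```

One standard deviation away from the measured vector along either axis the density falls to exp(−½) of its peak value.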

An alternative, simplified, output of FIG. 10 is a scalar confidence signal rather than the spread vectors. This may be more convenient for some applications. Such a signal may be derived from r_(error), the product of the outputs of lookup tables 3 and 4 in FIG. 10, which provides a scalar indication of the motion vector measurement error.

The confidence signal may then be used to implement graceful fallback in a motion compensated image interpolator as described in reference 4. The r_(error) signal is a scalar, average, measure of motion vector error. It assumes that the error distribution is isotropic and, whilst this may not be justified in some situations, it allows a simple confidence measure to be generated. Note that the scalar vector error, r_(error), is an objective function of the video signal, whilst the derived confidence signal is an interpretation of it.

A confidence signal may be generated by assuming that there is a small range of vectors which shall be treated as correct. This predefined range of correct vectors will depend on the application. We may, for example, define motion vectors to be correct if they are within, say, 10% of the true motion vector. Outside the range of correct vectors we shall have decreasing confidence in the motion vector. The range of correct motion vectors is the confidence region specified by r_(confident) which might, typically, be defined according to equation 21;

$r_{confident} = k \cdot |v| + r_{0}$

where k is a small fraction (typically 10%), r₀ is a small constant (typically 1 pixel/field) and |v| is the measured motion speed. The parameters k and r₀ can be adjusted during testing to achieve best results. Hence the region of confidence is proportional to the measured motion speed except at low speeds when it is a small constant. The confidence value is then calculated, for each output motion vector, as the probability that the actual velocity is within the confidence radius, r_(confident), of the measured velocity. This may be determined by assuming a Gaussian probability distribution:

$\mathrm{confidence} = \frac{1}{2 \pi r_{error}^{2}} \int_{0}^{r_{confident}} 2 \pi x \cdot \exp\left( -\frac{1}{2} \frac{x^{2}}{r_{error}^{2}} \right) dx$

giving the following expression for vector confidence (equation 22):

$\mathrm{confidence} = 1 - \exp\left( -\frac{1}{2} \, \frac{r_{confident}^{2}}{r_{error}^{2}} \right)$
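Equations 21 and 22 can be sketched in Python as below. The function names are hypothetical, the defaults k = 0.1 and r₀ = 1 pixel/field are the typical values quoted in the text, and `confidence_numeric` is included only as a midpoint rule check that the closed form agrees with the integral it was derived from.

```python
import math


def confidence_radius(speed, k=0.1, r0=1.0):
    """Equation 21: radius of the region of vectors treated as correct."""
    return k * speed + r0


def confidence(r_confident, r_error):
    """Equation 22: probability that the true vector lies within r_confident."""
    return 1.0 - math.exp(-0.5 * (r_confident / r_error) ** 2)


def confidence_numeric(r_confident, r_error, steps=10000):
    """Midpoint rule evaluation of the Gaussian integral, for comparison only."""
    h = r_confident / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += 2.0 * math.pi * x * math.exp(-0.5 * (x / r_error) ** 2) * h
    return total / (2.0 * math.pi * r_error ** 2)


# A hypothetical vector of 10 pixels/field with r_error of 2 pixels/field.
radius = confidence_radius(10.0)
value = confidence(radius, 2.0)
```

As r_(error) grows relative to r_(confident) the confidence falls towards 0, and as it shrinks the confidence approaches 1.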

An embodiment of apparatus for estimating vector error is shown in FIGS. 6, 9 and 10. The apparatus of FIG. 9 calculates the error matrix using the outputs from the apparatus of FIG. 6, which were generated previously to estimate the motion vector. The error matrix input in the figure, E^(t)·E, is denoted Q to simplify the labelling. The contents of lookup tables 1 & 2 in FIG. 10 are defined by;

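Look Up Table 1 = $\mathrm{angle}\left( 2y, -\left( x \pm \sqrt{x^{2} + 4y^{2}} \right) \right)$

Look Up Table 2 = $1 / \left( 2 \left( 1 \pm \sqrt{x^{2} + 4y^{2}} \right) \right)$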

where $x = \frac{Q_{1,1} - Q_{2,2}}{Q_{1,1} + Q_{2,2}}$ and $y = \frac{Q_{1,2}}{Q_{1,1} + Q_{2,2}}$

and where the ‘angle(x, y)’ function gives the angle between the x axis and point (x, y) and where the positive sign is taken for one of the eigen analysis units and the negative sign is taken for the other unit.
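The orientation produced by lookup table 1 can be checked numerically. The Python sketch below assumes that the term x²+4y² sits under a square root, which is what the eigenvectors of a 2×2 symmetric matrix require, and it models ‘angle(a, b)’ as `atan2(b, a)`. The function names and the sample matrix are hypothetical; `eigen_residual` simply measures how far the returned direction is from being an eigenvector of Q.

```python
import math


def normalised_inputs(q11, q12, q22):
    """The two lookup table inputs x and y, normalised by the trace of Q."""
    trace = q11 + q22
    return (q11 - q22) / trace, q12 / trace


def axis_angle(x, y, sign):
    """Orientation of one principal axis; sign is +1 for one unit, -1 for the other."""
    root = math.sqrt(x * x + 4.0 * y * y)
    # angle(a, b) is the angle between the x axis and the point (a, b).
    return math.atan2(-(x + sign * root), 2.0 * y)


def eigen_residual(q11, q12, q22, angle):
    """|Q e - (e^t Q e) e| for the unit vector e at the given angle (0 for an eigenvector)."""
    ex, ey = math.cos(angle), math.sin(angle)
    qx, qy = q11 * ex + q12 * ey, q12 * ex + q22 * ey
    lam = ex * qx + ey * qy
    return math.hypot(qx - lam * ex, qy - lam * ey)


# Hypothetical error matrix elements Q11, Q12, Q22.
x, y = normalised_inputs(4.0, 1.0, 3.0)
angle_a = axis_angle(x, y, +1)
angle_b = axis_angle(x, y, -1)
```

The two signs give perpendicular directions, as expected for the principal axes of the error distribution.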

The input of lookup table 3 in FIG. 10 (Q₁₁+Q₂₂) is a dimensioned parameter (z) which describes the scale of the distribution of motion vector errors. The content of lookup table 3 is defined by (z/N_(total)·N_(effective)). The output of lookup table 3 is a scaling factor which can be used to scale the output of lookup table 2 defined above. The input to the polar to rectangular coordinate converter is, therefore, related to the inverse of the length of each principal axis of the error distribution. Using different lookup tables it would be possible to calculate the spread vectors directly in Cartesian co-ordinates.

The apparatus described in relation to FIG. 10 is capable of producing both the spread vectors and the scalar confidence signal. The present invention encompasses methods and apparatus which generate only one such parameter; either the confidence signal or the spread vectors. The eigen analyses performed by the apparatus of FIG. 10 must be performed twice to give both spread vectors for each principal axis of the error distribution; only one implementation of FIG. 10 is required to generate r_(error) and the derived confidence signal. The inputs to lookup table 4 are the same as for lookup table 1 (x and y). The content of lookup table 4 is defined by ⁴√(¼(1−x²)−y²). The output of lookup table 4 scaled by the output of lookup table 3 gives r_(error), a scalar (isotropic) vector error from which a confidence signal is generated in lookup table 5, the contents of which are defined by equation 22, for example. r_(error) is the geometric mean of the length of the major and minor axes of the error distribution, that is, r_(error)=√(σ₁·σ₂).
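The statement that r_(error) is the geometric mean of the axis lengths can be checked with a short Python sketch. It assumes that lookup table 4 holds the fourth root of (¼(1−x²)−y²) and, purely for this check, that the scaling applied is √(z/N_total) with no N_(effective) reduction; that scaling is an assumption of the sketch and is not taken from the stated contents of lookup table 3. All function names are hypothetical.

```python
import math


def lut4(x, y):
    """Fourth root of (1/4)(1 - x^2) - y^2, i.e. of det(Q) / trace(Q)^2."""
    return max(0.25 * (1.0 - x * x) - y * y, 0.0) ** 0.25


def r_error_unreduced(q11, q12, q22, n_total):
    """Scalar error from the normalised inputs, scaled by sqrt(z / N_total) (assumed)."""
    z = q11 + q22
    x, y = (q11 - q22) / z, q12 / z
    return math.sqrt(z / n_total) * lut4(x, y)


def sigma_geometric_mean(q11, q12, q22, n_total):
    """sqrt(sigma_1 * sigma_2) computed directly from the eigenvalues of Q."""
    mean = 0.5 * (q11 + q22)
    root = math.hypot(0.5 * (q11 - q22), q12)
    sigma_1 = math.sqrt((mean + root) / n_total)
    sigma_2 = math.sqrt(max(mean - root, 0.0) / n_total)
    return math.sqrt(sigma_1 * sigma_2)


# Hypothetical error matrix with sigma_1^2 = 2 and sigma_2^2 = 0.5 for N_total = 4.
r_lookup = r_error_unreduced(2.0, 0.0, 8.0, 4)
r_direct = sigma_geometric_mean(2.0, 0.0, 8.0, 4)
```

Both routes give the same value because ¼(1−x²)−y² equals the determinant of Q divided by the square of its trace.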

In FIGS. 7 and 9 picture resizing is allowed for using (intrafield) spatial interpolators (44) following the region averaging filters (38, 39, 42). Picture resizing is optional and is required, for example, for overscan and aspect ratio conversion. The apparatus of FIG. 6 generates its outputs on the nominal output standard, that is, assuming no picture resizing. The conversion from input to (nominal) output standard is achieved using (bilinear) vertical/temporal interpolators (20). Superficially it might appear that these interpolators (20) could also perform the picture stretching or shrinking required for resizing. However, if this were done the region averaging filters (38, 42) in FIGS. 7 and 9 would have to vary in size with the resizing factor. This would be very awkward for large picture expansions as very large region averaging filters (38, 42) would be required. Picture resizing is therefore achieved after the region averaging filters using purely spatial (intrafield) interpolators (44), for example bilinear interpolators. In fact the function of the vertical/temporal filters (20) in FIG. 6 is, primarily, to interpolate to the output field rate. The only reason they also change the line rate is to maintain a constant data rate.

Experimental Results

Experiments were performed to simulate the basic motion estimation algorithm (FIGS. 2 & 3), use of the normalised constraint equation (FIGS. 6 & 7), the Martinez technique with the normalised constraint equation and estimation of vector measurement error (FIGS. 9 & 5). In general these experiments confirmed the theory and techniques described above.

Simulations were performed using a synthetic panning sequence. This was done both for convenience and because it allowed a precisely known motion to be generated. Sixteen field long interlaced sequences were generated from an image for different motion speeds. The simulation suggests that the basic gradient motion estimation algorithm gives the correct motion vector with a (standard deviation) measurement error of about ±¼ pixel/field. The measured velocity at the edge of the picture generally tends towards zero because the filters used are not wholly contained within the image. Occasionally unrealistically high velocities are generated at the edge of the image. The use of the normalised constraint equation gave similar results to the unnormalised equation. Use of the Martinez technique gave varying results depending on the level of noise assumed. This technique never made things worse and could significantly reduce worst case (and average) errors at the expense of biasing the measured velocity towards zero. The estimates of the motion vector error were consistent with the true (measured) error.

EXAMPLE

This example provides a brief specification for a gradient motion estimator for use in a motion compensated standards converter. The input for this gradient motion estimator is interlaced video in either 625/50/2:1 or 525/60/2:1 format. The motion estimator produces motion vectors on one of the two possible input standards and also an indication of the vector's accuracy on the same standard as the output motion vectors. The motion vector range is at least ±32 pixels/field. The vector accuracy is output as both a ‘spread vector’ and a ‘confidence signal’.

A gradient motion estimator is shown in block diagram form in FIGS. 6 & 7 above. Determination of the measurement error, indicated by ‘spread vectors’ and ‘confidence’, is shown in FIGS. 9 & 10. The characteristics of the functional blocks of these block diagrams are as follows:

Input Video:

4:2:2 raster scanned interlaced video.

luminance component only

Active field 720 pixels × 288 or 244 field lines depending on input standard.

Luminance coding 10 bit, unsigned binary representing the range 0 to (2¹⁰−1)

Temporal Halfband Lowpass Filter (14):

Function: Temporal filter operating on luminance. Implemented as a vertical/temporal filter because the input is interlaced. The coefficients are defined by the following matrix in which columns represent fields and rows represent picture (not field) lines.

$\text{Temporal Halfband filter coefficients} = \frac{1}{8} \begin{pmatrix} 1 & 0 & 1 \\ 0 & 4 & 0 \\ 1 & 0 & 1 \end{pmatrix}$

Input: 10 bit unsigned binary representing the range 0 to 1023 (decimal)

Output: 12 bit unsigned binary representing the range 0 to 1023.75 (decimal) with 2 fractional bits.

Vertical Lowpass Filter (12):

Function: Vertical intra field, 1/16 band, lowpass, prefilter and anti-alias filter. Cascade of 3 vertical running sum filters with lengths 16, 12 and 5 field lines. The output of this cascade of running sums is divided by 1024 to give an overall D.C. gain of 15/16. The overall length of the filter is 31 field lines.

Input: As Temporal Halfband Lowpass Filter output.

Output: As Temporal Halfband Lowpass Filter output.

Horizontal Lowpass Filter (10):

Function: Horizontal, 1/32nd band, lowpass, prefilter. Cascade of 3 horizontal running sum filters with lengths 32, 21 and 12 pixels. The output of this cascade is divided by 8192 to give an overall D.C. gain of 63/64. The overall length of the filter is 63 pixels.

Input: As Vertical Lowpass Filter output.

Output: As Vertical Lowpass Filter output.
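The D.C. gains and overall lengths quoted for the vertical and horizontal prefilters follow directly from the running sum lengths and divisors. A minimal Python sketch (function names are illustrative assumptions) builds the impulse response of each cascade by repeated convolution with a block of ones and checks the figures exactly.

```python
from fractions import Fraction


def running_sum_cascade(lengths):
    """Impulse response of a cascade of running sum filters with the given lengths."""
    taps = [1]
    for n in lengths:
        out = [0] * (len(taps) + n - 1)
        for i, t in enumerate(taps):
            for j in range(n):
                out[i + j] += t   # convolve with n unit taps
        taps = out
    return taps


def dc_gain(lengths, divisor):
    """Sum of the taps divided by the output scaling, as an exact fraction."""
    return Fraction(sum(running_sum_cascade(lengths)), divisor)


vertical_taps = running_sum_cascade((16, 12, 5))
horizontal_taps = running_sum_cascade((32, 21, 12))
vertical_gain = dc_gain((16, 12, 5), 1024)       # 960 / 1024
horizontal_gain = dc_gain((32, 21, 12), 8192)    # 8064 / 8192
```

Choosing power of two divisors keeps the scaling a simple bit shift, at the cost of a gain slightly below unity.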

Temporal Differentiator (16):

Function: Temporal differentiation of prefiltered luminance signal. Implemented as a vertical/temporal filter for interlaced inputs.

$\text{Temporal Differentiator coefficients} = \frac{1}{4} \begin{pmatrix} 1 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & -1 \end{pmatrix}$

Input: As Horizontal Lowpass Filter output.

Output: 12 bit 2's complement binary representing the range −2⁹ to (+2⁹−2⁻²).

Horizontal Differentiator (17):

Function: Horizontal differentiation of prefiltered luminance signal. 3 tap horizontal filter with coefficients ½(1, 0, −1) on consecutive pixels.

Input: As Horizontal Lowpass Filter output.

Output: 8 bit 2's complement binary representing the range −2⁴ to (+2⁴−2⁻³).

Vertical Differentiator (18):

Function: Vertical differentiation of prefiltered luminance signal. 3 tap, intra-field, vertical filter with coefficients ½(1, 0, −1) on consecutive field lines.

Input: As Horizontal Lowpass Filter output.

Output: 8 bit 2's complement binary representing the range −2⁴ to (+2⁴−2⁻³).

Compensating Delay (19):

Function: Delay of 1 input field.

Input & Output: As Horizontal Lowpass Filter output.

Vertical/Temporal Interpolators (20):

Function: Conversion between input and output scanning standards. Cascade of intra field, 2 field line linear interpolator and 2 field linear interpolator, i.e. a vertical/temporal bi-linear interpolator. Interpolation accuracy to nearest 1/32nd field line and nearest 1/16th field period.

Inputs: as indicated in FIG. 6 and specified above.

Outputs: same precision as inputs.

θ: Orientation of spatial gradient vector of image brightness. 12 bit unipolar binary spanning the range 0 to 2π, i.e. quantisation step is 2π/2¹². This is the same as 2's complement binary spanning the range −π to +π.

|∇I|: Magnitude of spatial gradient vector of image brightness. 12 bit unipolar binary spanning the range 0 to 16 (input grey levels/pixel) with 8 fractional bits.

n: Noise level of |∇I| adjustable from 1 to 16 input grey levels/pixel.

vn: Motion vector of current pixel in direction of brightness gradient. 12 bit, 2's complement binary clipped to the range −2⁶ to (+2⁶−2⁻⁵) pixels/field.
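The remark that the 12 bit unipolar coding of θ is the same as a 2's complement coding can be illustrated with a small Python sketch. The function names are hypothetical; the point shown is that the same 12 bit pattern read as unsigned covers 0 to 2π and read as signed covers −π to +π, both describing the same angle modulo 2π.

```python
import math

BITS = 12
LEVELS = 1 << BITS            # 4096 codes, quantisation step 2*pi / 2**12


def quantise_angle(theta):
    """Nearest 12 bit unipolar code for an angle in radians (wraps modulo 2*pi)."""
    return round(theta / (2.0 * math.pi) * LEVELS) % LEVELS


def as_signed(code):
    """Reinterpret the same bit pattern as a 12 bit 2's complement number."""
    return code - LEVELS if code >= LEVELS // 2 else code


def code_to_angle(value):
    """Angle in radians represented by a signed or unsigned code value."""
    return value * 2.0 * math.pi / LEVELS


quarter_turn = quantise_angle(math.pi / 2.0)              # unsigned code
three_quarter_turn = quantise_angle(3.0 * math.pi / 2.0)  # unsigned code
minus_quarter_turn = quantise_angle(-math.pi / 2.0)       # wraps to the same code
signed_view = as_signed(three_quarter_turn)               # same bits, signed reading
```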

Polar to Rectangular Co-ordinate Converter (40):

Inputs: as vn & θ above

Outputs: 12 bit, 2's complement binary representing the range −2⁶ to (+2⁶−2⁻⁵)

Lookup Tables No.1 & No.2 (FIGS. 7 and 9)

Function: Cosine and Sine lookup tables respectively.

Inputs: as θ above.

Outputs: 12 bit, 2's complement binary representing the range −1 to (+1−2⁻¹¹)

Region Averaging Filters (38, 39, 42):

Function: Averaging signals over a region of the image. 95 pixels by 47 field lines, intrafield, running average filter.

Inputs & Outputs: 12 bit 2's complement binary.

Spatial Interpolators (44):

Function: Converting spatial scanning to allow for picture resizing.Spatial, intrafield bilinear interpolator.

Interpolation accuracy to nearest 1/32nd field line and nearest 1/16th pixel.

Inputs: 12 bit 2's complement binary.

Outputs: 12 or 8/9 bit 2's complement binary.

Upper Interpolators feeding multipliers 12 bit.

Lower Interpolators feeding Lookup tables 8/9 bit (to ensure a practical size table).

Look Up Tables 3 to 5 (FIG. 7):

Function: Calculating matrix ‘Z’ defined in equation 14 above.

Parameters n₁ & n₂ adjust on test (approx. 2-5).

Inputs: 8/9 bit 2's complement binary representing −1 to (approx.) +1.

Outputs: 12 bit 2's complement binary representing the range −16 to (+16−2⁻⁵).

Multipliers & Accumulators:

Inputs & Outputs: 12 bit 2's complement binary.

Motion Vector Output:

Output of FIG. 7.

Motion vectors are measured in input picture lines (not field lines) or horizontal pixels per input field period.

Motion speeds are unlikely to exceed +48 pixels/field but an extra bit is provided for headroom.

Raster scanned interlaced fields.

Active field depends on output standard: 720 pixels × 288 or 241 field lines.

12 bit signal, 2's complement coding, 8 integer and 4 fractional bits representing the range −128 to (+128−2⁻⁴)
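The signal ranges quoted throughout this specification follow a single rule for 2's complement fixed point numbers. The Python sketch below is an illustration only; the function name `twos_complement_range` is an assumption, and the integer bit count is taken to include the sign bit.

```python
from fractions import Fraction


def twos_complement_range(total_bits, fractional_bits):
    """Smallest and largest values of a 2's complement fixed point word.

    The integer part (total_bits - fractional_bits) includes the sign bit,
    so the range is -2**(int_bits - 1) to +2**(int_bits - 1) - 2**(-fractional_bits).
    """
    int_bits = total_bits - fractional_bits
    lowest = -Fraction(2 ** (int_bits - 1))
    step = Fraction(1, 2 ** fractional_bits)
    return lowest, -lowest - step


motion_vector_range = twos_complement_range(12, 4)    # 8 integer, 4 fractional bits
temporal_diff_range = twos_complement_range(12, 2)    # temporal differentiator output
spatial_diff_range = twos_complement_range(8, 3)      # horizontal/vertical differentiators
vn_range = twos_complement_range(12, 5)               # vn and polar converter outputs
```

For the motion vector output this gives −128 to +128 − 1/16, that is, a resolution of one sixteenth of a pixel per field.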

Spread vectors S₁ and S₂ (Output of FIG. 10):

Spread vectors represent the measurement spread of the output motion vectors parallel and perpendicular to edges in the input image sequence.

The spread vectors are of magnitude σ⁻¹ (where σ represents standard deviation) and point in the direction of the principal axes of the expected distribution of measurement error. Each spread vector has two components each coded using two's complement fractional binary representing the range −1 to (+1−2⁻⁷).

Confidence Output:

Output of FIG. 10, derivation of confidence signal described above.

The confidence signal is an indication of the reliability of the ‘Output Motion Vector’. Confidence of 1 represents high confidence, 0 represents no confidence.

The confidence signal uses 8 bit linear coding with 8 fractional bits representing the range 0 to (1−2⁻⁸).

REFERENCES

1. Aggarwal, J. K. & Nandhakumar, N. 1988. On the computation of motion from sequences of images - a review. Proc. IEEE, vol. 76, pp. 917-935, August 1988.

2. Bierling, M., Thoma, R. 1986. Motion compensating field interpolation using a hierarchically structured displacement estimator. Signal Processing, Volume 11, No. 4, December 1986, pp. 387-404. Elsevier Science publishers.

3. Borer, T. J., 1992. Television Standards Conversion. Ph.D. Thesis, Dept. Electronic & Electrical Engineering, University of Surrey, Guildford, Surrey, GU2 5XH, UK. October 1992.

4. Borer, T. J., 1995. Adaptive Motion trajectories for interpolating moving images. UK Patent Application No. 9519311.6, filed Sep. 21, 1995.

5. Cafforio, C., Rocca, F. 1983. The differential method for image motion estimation. Image sequence processing and dynamic scene analysis (ed. T. S. Huang). Springer-Verlag, pp 104-124, 1983.

6. Cafforio, C., Rocca, F., Tubaro, S., 1990. Motion Compensated image Interpolation. IEEE Trans. on Comm. Vol. 38, No. 2, February 1990, pp 215-222.

7. Dubois, E., Konrad, J., 1990. Review of techniques for motion estimation and motion compensation. Fourth International Colloquium on Advanced Television Systems, Ottawa, Canada, June 1990. Organised by CBC Engineering, Montreal, Quebec, Canada.

8. Fennema, C. L., Thompson, W. B., 1979. Velocity determination in scenes containing several moving objects. Computer Vision, Graphics and Image Processing, Vol. 9, pp 301-315, 1979.

9. Huang, T. S., Tsai, R. Y., 1981. Image sequence analysis: Motion estimation. Image sequence analysis, T. S. Huang (editor), Springer-Verlag, Berlin, Germany, 1981, pp. 1-18.

10. Konrad, J., 1990. Issues of accuracy and complexity in motion compensation for ATV systems. Contribution to ‘Les Assises Des Jeunes Chercheurs’, CBC, Montreal June 1990.

11. Lim, J. S., 1990. Two-dimensional signal and image processing. Prentice Hall 1990, ISBN 0-13-934563-9, pp 497-511.

12. Martinez, D. M. 1987. Model-based motion estimation and its application to restoration and interpolation of motion pictures. RLE Technical Report No. 530. June 1987. Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Mass. 02139 USA.

13. Netravali, A. N., Robbins, J. D. 1979. Motion compensated television coding, Part 1. Bell Syst. Tech. J., vol. 58, pp 631-670, March 1979.

14. Paquin, R., Dubois, E., 1983. A spatio-temporal gradient method for estimating the displacement vector field in time-varying imagery. Computer Vision, Graphics and Image Processing, Vol. 21, 1983, pp 205-221.

15. Robert, P., Cafforio, C., Rocca, F., 1985. Time/Space recursion for differential motion estimation. SPIE Symp., Cannes, France, November 1985.

16. Thomson, R. 1995. Problems of Estimation and Measurement of Motion in Television. I.E.E. colloquium on motion reproduction in television. I.E.E Digest No: 1995/093, May 3, 1995.

17. Vega-Riveros, J. F., Jabbour, K. 1986. Review of motion analysis techniques. IEE Proceedings. Vol. 136, Pt I., No. 6, December 1989.

18. Wu, S. F., Kittler, J., 1990. A differential method for the simultaneous estimation of rotation, change of scale and translation. Image Communication, Vol. 2, No. 1, May 1990, pp 69-80

What is claimed is:
 1. A motion vector estimation apparatus for use in video signal processing which is adapted to generate motion vector sampled on an output sampling lattice, comprising: a first means for spatially filtering an input signal sampled on an input lattice; a second means for means operating on said input signal for calculating image gradients sampled on said input lattice; a third means for converting the signal sampled on said input lattice to a signal sample on said output sampling lattice said first, second and third means operating on said input signal in any order and having a plurality of image gradients sampled on said output lattice as an output; and a fourth means for calculating motion vectors, wherein said fourth means for calculating motion vectors has as an input said plurality of image gradients sampled on said output sampling lattice.
 2. A motion vector estimation apparatus as claimed in claim 1, wherein said first means for spatially filtering the signal includes spatial low pass filters.
 3. A motion vector estimation apparatus as claimed in claim 1, wherein said second means for calculating the image gradients includes temporal and spatial differentiators.
 4. A motion vector estimation apparatus as claimed in claim 1, wherein said third means for converting the input signal includes a vertical/temporal interpolator.
 5. A motion vector estimation apparatus as claimed in claim 1, and further comprising a multiplier array for calculating image gradient products from a plurality of said image gradients generated on the output standard, filters for summing said plurality of image gradient products, wherein said second means for calculating the motion vectors utilizes the sums of a plurality of image gradient products to generate the best fit motion vector.
 6. A motion vector estimation apparatus as claimed in claim 5 wherein said best fit motion vector is determined in accordance with the equation:

$\begin{bmatrix} u_{0} \\ v_{0} \end{bmatrix} = -\left( \frac{\lambda_{1}}{\lambda_{1}^{2} + n_{1}^{2}} e_{1} e_{1}^{t} + \frac{\lambda_{2}}{\lambda_{2}^{2} + n_{2}^{2}} e_{2} e_{2}^{t} \right) \cdot \begin{bmatrix} \sigma_{xt}^{2} \\ \sigma_{yt}^{2} \end{bmatrix}$

where [u₀ v₀] is the best fitting motion vector, σ represents the roots of sums of products of image gradients the subscript signifying the particular image gradients, and e and λ are the eigen vectors and eigen values of M, where

$M = \begin{bmatrix} \sigma_{xx}^{2} & \sigma_{xy}^{2} \\ \sigma_{xy}^{2} & \sigma_{yy}^{2} \end{bmatrix}$


7. A motion vector estimation apparatus as claimed in claim 1, andfurther comprising means for calculating from a plurality of imagegradients generated on said output sampling lattice, the spatial imagegradient (|∇I|), the angle between the spatial image gradient and thehorizontal (θ) and the motion speed (vn) in the direction of the imagegradient vector.
 8. A motion vector estimation apparatus as claimed inclaim 7, wherein said means for calculating motion vectors calculatesthe best fitting motion vector in accordance with the equation:$\begin{bmatrix}u_{0} \\v_{0}\end{bmatrix} = {{- \left( {{\frac{\lambda_{1}}{\lambda_{1}^{2} + n_{1}^{2}}\quad e_{1}e_{1}^{t}} + {\frac{\lambda_{2}}{\lambda_{2}^{2} + n_{2}^{2}}\quad e_{2}e_{2}^{t}}} \right)} \cdot \begin{bmatrix}{\sum{v\quad {n \cdot {\cos (\theta)}}}} \\{\sum{v\quad {n \cdot {\sin (\theta)}}}}\end{bmatrix}}$

where e and λ are the eigenvectors and eigenvalues of: $\begin{bmatrix}{\sum{\cos^{2}(\theta)}} & {\sum{{\cos (\theta)} \cdot {\sin (\theta)}}} \\{\sum{{\cos (\theta)} \cdot {\sin (\theta)}}} & {\sum{\sin^{2}(\theta)}}\end{bmatrix}$


9. A method of estimating motion vectors on an output sampling lattice for use in video-signal processing, comprising the following steps: (a) spatially filtering a video signal, (b) converting the signal from an input sampling lattice to said output sampling lattice, (c) calculating a plurality of image gradients; wherein steps (a) to (c) are carried out in any order, and calculating motion vectors on said output sampling lattice from said image gradients generated on said output sampling lattice.
10. A method of motion vector estimation as claimed in claim 9, wherein said step of spatially filtering the input signal includes filtering by spatial low pass filters.
11. A method of motion vector estimation as claimed in claim 9, wherein calculating said plurality of image gradients includes temporally and spatially differentiating the spatially filtered signal.
12. A method of motion vector estimation as claimed in claim 9, wherein converting the signal includes vertical and temporal interpolation.
13. A method of motion vector estimation as claimed in claim 9 and further comprising calculating image gradient products from a plurality of said image gradients generated on the output standard, summing said plurality of image gradient products, wherein calculating the motion vectors comprises utilizing the sums of a plurality of image gradient products to generate the best fit motion vector.
14. A method of motion vector estimation as claimed in claim 13, wherein said best fit motion vector is determined in accordance with the equation: $\begin{bmatrix}u_{0} \\v_{0}\end{bmatrix} = {{- \left( {{\frac{\lambda_{1}}{\lambda_{1}^{2} + n_{1}^{2}}\quad e_{1}e_{1}^{t}} + {\frac{\lambda_{2}}{\lambda_{2}^{2} + n_{2}^{2}}\quad e_{2}e_{2}^{t}}} \right)} \cdot \begin{bmatrix}\sigma_{xt}^{2} \\\sigma_{yt}^{2}\end{bmatrix}}$

where [u₀ v₀] is the best fitting motion vector, σ represents the roots of sums of products of image gradients, the subscript signifying the particular image gradients, and e and λ are the eigenvectors and eigenvalues of M, where $M = \begin{bmatrix}\sigma_{xx}^{2} & \sigma_{xy}^{2} \\\sigma_{xy}^{2} & \sigma_{yy}^{2}\end{bmatrix}$


15. A method of motion vector estimation as claimed in claim 9, and further comprising calculating from a plurality of image gradients generated on said output sampling lattice, the spatial image gradient (|∇I|), the angle between the spatial image gradient and the horizontal (θ) and the motion speed (vn) in the direction of the image gradient vector.
16. A method of motion vector estimation as claimed in claim 15, wherein said means for calculating motion vectors calculates the best fitting motion vector in accordance with the equation: $\begin{bmatrix}u_{0} \\v_{0}\end{bmatrix} = {{- \left( {{\frac{\lambda_{1}}{\lambda_{1}^{2} + n_{1}^{2}}\quad e_{1}e_{1}^{t}} + {\frac{\lambda_{2}}{\lambda_{2}^{2} + n_{2}^{2}}\quad e_{2}e_{2}^{t}}} \right)} \cdot \begin{bmatrix}{\sum{{vn} \cdot {\cos (\theta)}}} \\{\sum{{vn} \cdot {\sin (\theta)}}}\end{bmatrix}}$

where e and λ are the eigenvectors and eigenvalues of: $\begin{bmatrix}{\sum{\cos^{2}(\theta)}} & {\sum{{\cos (\theta)} \cdot {\sin (\theta)}}} \\{\sum{{\cos (\theta)} \cdot {\sin (\theta)}}} & {\sum{\sin^{2}(\theta)}}\end{bmatrix}$
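
By way of illustration only, the best fit equations recited in claims 6, 8, 14 and 16 may be sketched in Python as follows. The sketch is not part of the claimed apparatus or method and rests on several assumptions: the function names (`sym_eig_2x2`, `best_fit_vector`, `from_gradient_products`, `from_angles`) are hypothetical; n₁ and n₂, which are not defined in this section, are treated as caller-supplied constants defaulting to zero; σ² terms are taken to be plain sums of gradient products over a region; and vn is assumed to be the temporal gradient divided by |∇I|. An eigenvalue direction whose denominator is zero is skipped, which is a choice made for the sketch.

```python
import math


def sym_eig_2x2(a, b, d):
    """Return ((lam1, e1), (lam2, e2)) for the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    phi = 0.5 * math.atan2(2.0 * b, a - d)  # orientation of the principal axis
    e1 = (math.cos(phi), math.sin(phi))
    e2 = (-math.sin(phi), math.cos(phi))
    return (mean + radius, e1), (mean - radius, e2)


def best_fit_vector(mxx, mxy, myy, bx, by, n1=0.0, n2=0.0):
    """Evaluate -(sum_i lam_i / (lam_i^2 + n_i^2) e_i e_i^t) . [bx, by]."""
    u = v = 0.0
    for (lam, (ex, ey)), n in zip(sym_eig_2x2(mxx, mxy, myy), (n1, n2)):
        denom = lam * lam + n * n
        if denom == 0.0:
            continue  # assumption: no contribution from a direction with no gradient
        weight = lam / denom
        proj = ex * bx + ey * by  # e_i^t . b
        u -= weight * proj * ex
        v -= weight * proj * ey
    return u, v


def from_gradient_products(gradients, n1=0.0, n2=0.0):
    """Gradient-product form; gradients is an iterable of (Ix, Iy, It) samples."""
    sxx = sxy = syy = sxt = syt = 0.0
    for ix, iy, it in gradients:
        sxx += ix * ix
        sxy += ix * iy
        syy += iy * iy
        sxt += ix * it
        syt += iy * it
    return best_fit_vector(sxx, sxy, syy, sxt, syt, n1, n2)


def from_angles(gradients, n1=0.0, n2=0.0):
    """Angle form; assumes vn = It / |grad I| (an assumption of this sketch)."""
    cc = cs = ss = vc = vs = 0.0
    for ix, iy, it in gradients:
        mag = math.hypot(ix, iy)
        if mag == 0.0:
            continue  # the angle is undefined where the spatial gradient vanishes
        c, s, vn = ix / mag, iy / mag, it / mag
        cc += c * c
        cs += c * s
        ss += s * s
        vc += vn * c
        vs += vn * s
    return best_fit_vector(cc, cs, ss, vc, vs, n1, n2)


if __name__ == "__main__":
    # Synthetic samples obeying u*Ix + v*Iy + It = 0 with (u, v) = (2, -1).
    samples = [(1.0, 0.0, -2.0), (0.0, 1.0, 1.0), (1.0, 1.0, -1.0)]
    print(from_gradient_products(samples))
    print(from_angles(samples))
```

With n₁ = n₂ = 0 and a non-singular matrix the expression reduces to the ordinary least squares solution, so both calls above recover approximately (2, −1) for the synthetic samples. Non-zero values of n₁ and n₂ reduce the contribution of directions with small eigenvalues.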